Skip to content

Fixes some theoretical races and retentions in AsyncOperation and ManualResetValueTaskSource#127256

Open
VSadov wants to merge 4 commits intodotnet:mainfrom
VSadov:vtsRaces
Open

Fixes some theoretical races and retentions in AsyncOperation and ManualResetValueTaskSource#127256
VSadov wants to merge 4 commits intodotnet:mainfrom
VSadov:vtsRaces

Conversation

@VSadov
Copy link
Copy Markdown
Member

@VSadov VSadov commented Apr 22, 2026

Possibly fixes: #127234

  • Completion sees AsyncOperation in completed state, very rarely on ARM64. Seems like an ordering issue, since it is on ARM64.
    I see one case where it seems we miss a fence. Not sure if it is related. The actual failure is very hard to repro on purpose.

Possibly fixes: #126735

  • Test observes object rooting on runtime async.

While logically runtime async captures the same state, there are some implementation details like in the {_continuation, _continuationState} tuple, the continuation may retain less, but state may retain more, compared to regular async. There are code paths that clear continuation, but do not clear the state, thus can observe the difference.
We should fix that regardless, even if #126735 is caused by something else.

The main idea is that continuation/state are needed only until the continuation runs. Retaining longer than that in poolable/reusable containers (i.e. until container is reused) can potentially lead to observable issues like last used connection not disposing/closing if no further activity in the app.

Same applies to the captured context - as soon as we do not need it, it would not hurt to clear fields holding to the context. No problem was observed due to this, but just in case...

This PR also makes AsyncOperation and ManualResetValueTaskSource to be slightly more similar. Mostly by using the completion sentinel as the only way to signal completion + some minor tweaks, cross-copying comments,...

Copilot AI review requested due to automatic review settings April 22, 2026 01:08
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @VSadov
See info in area-owners.md if you want to be subscribed.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the completion/continuation signaling mechanics for System.Threading.Channels.AsyncOperation and ManualResetValueTaskSourceCore<TResult> to reduce theoretical races and retained-state rooting by making completion primarily signaled via a sentinel and clearing captured state more aggressively.

Changes:

  • Switch completion detection to use a volatile read of the continuation sentinel.
  • Refactor OnCompleted race handling to queue continuations directly and clear stored context/state earlier.
  • Add ClearRetainedState hooks for pooled async operations to reduce retained references.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
src/libraries/System.Threading.Channels/src/System/Threading/Channels/AsyncOperation.cs Adjusts completion detection/OnCompleted race path and introduces retained-state clearing for pooled operations.
src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Sources/ManualResetValueTaskSourceCore.cs Removes _completed flag, uses sentinel-based completion, and refactors OnCompleted/SignalCompletion to align with the new completion model.
Comments suppressed due to low confidence (1)

src/libraries/System.Threading.Channels/src/System/Threading/Channels/AsyncOperation.cs:283

  • SetCompletionAndInvokeContinuation sets _continuation = s_completedSentinel via a normal write, while completion is now observed via Volatile.Read(ref _continuation) in IsCompleted/GetStatus. To avoid weak-memory visibility issues (e.g., other threads seeing completion without reliably seeing prior _error/_result writes, or seeing a stale non-sentinel value), set the sentinel using a release operation (Volatile.Write or an interlocked write) in both the non-EC and EC paths.
                Action<object?> c = _continuation!;
                _continuation = s_completedSentinel;
                object? state = _continuationState;
                _continuationState = null;
                c(state);

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 22, 2026 01:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/libraries/System.Threading.Channels/src/System/Threading/Channels/AsyncOperation.cs:283

  • IsCompleted now uses Volatile.Read(ref _continuation), but _continuation is set to s_completedSentinel via a plain write in SetCompletionAndInvokeContinuation. Consider using Volatile.Write (or Interlocked.Exchange) when publishing the completion sentinel so other threads that observe completion via IsCompleted reliably see prior writes (e.g., _error / result) and don't observe stale state.
            object? ctx = _capturedContext;
            _capturedContext = null;

            ExecutionContext? ec =
                ctx is null ? null :
                ctx as ExecutionContext ??
                (ctx as CapturedSchedulerAndExecutionContext)?._executionContext;

            if (ec is null)
            {
                Action<object?> c = _continuation!;
                _continuation = s_completedSentinel;
                object? state = _continuationState;
                _continuationState = null;
                c(state);

_continuation = null;
Debug.Assert(_continuationState == null);
Debug.Assert(_capturedContext == null);
_continuationState = null;
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still clearing _continuationState and _capturedContext here, not relying on assertions since the type is public.
The assertions are there to catch internal misuse.


Action<object?>? continuation =
Volatile.Read(ref _continuation) ??
Interlocked.CompareExchange(ref _continuation, ManualResetValueTaskSourceCoreShared.s_sentinel, null);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the MRVTS becomes officially completed. That is - unless it already has a continuation, in such case it becomes completed a few lines later, right before we deal with continuation.

There is no need for _completed field.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same pattern as used in AsyncOperation

@@ -139,7 +139,7 @@ private CancellationToken CancellationToken
#endif

/// <summary>Gets whether the operation has completed.</summary>
internal bool IsCompleted => ReferenceEquals(_continuation, s_completedSentinel);
internal bool IsCompleted => ReferenceEquals(Volatile.Read(ref _continuation), s_completedSentinel);
Copy link
Copy Markdown
Member Author

@VSadov VSadov Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performing an ordinary read could allow prefetching _error and we could end up assuming that a failed/cancelled operation actually cleanly completed.

@VSadov
Copy link
Copy Markdown
Member Author

VSadov commented Apr 22, 2026

I think this is ready for a review. Wasm failures appear unrelated.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment on lines +247 to +248
// and this write is after that.
_continuation = ManualResetValueTaskSourceCoreShared.s_sentinel;
Copy link

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In SignalCompletion, _continuation = ManualResetValueTaskSourceCoreShared.s_sentinel; is a non-volatile write. Since IsCompleted uses Volatile.Read(ref _continuation) to publish/observe completion, this should be a releasing write (e.g., Volatile.Write or an interlocked exchange) to ensure other threads can't observe completion before _result / _error are visible (especially on weak memory models like ARM64).

Suggested change
// and this write is after that.
_continuation = ManualResetValueTaskSourceCoreShared.s_sentinel;
// and this write is after that. Use a volatile write so observers that
// use Volatile.Read(ref _continuation) can't observe completion before
// the preceding writes are visible.
Volatile.Write(ref _continuation, ManualResetValueTaskSourceCoreShared.s_sentinel);

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained in the comments the interlocked operation that stores continuation already orders this write the way we want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AF: The continuation was the completion sentinel. Possible object rooting issue with runtime-async

2 participants